An Adaptive Information Borrowing Platform Design for Testing Drug Candidates of COVID-19

Background There have been thousands of clinical trials for COVID-19 to target effective treatments. However, quite a few of them are traditional randomized controlled trials with low efficiency. Considering the three particularities of pandemic disease: timeliness, repurposing, and case spike, new trial designs need to be developed to accelerate drug discovery. Methods We propose an adaptive information borrowing platform design that can sequentially test drug candidates under a unified framework with early efficacy/futility stopping. Power prior is used to borrow information from previous stages and the time trend calibration method deals with the baseline effectiveness drift. Two drug development strategies are applied: the comprehensive screening strategy and the optimal screening strategy. At the same time, we adopt adaptive randomization to set a higher allocation ratio to the experimental arms for ethical considerations, which can help more patients to receive the latest treatments and shorten the trial duration. Results Simulation shows that in general, our method has great operating characteristics with type I error controlled and power increased, which can select effective/optimal drugs with a high probability. The early stopping rules can be successfully triggered to stop the trial when drugs are either truly effective or not optimal, and the time trend calibration performs consistently well with regard to different baseline drifts. Compared with the nonborrowing method, borrowing information in the design substantially improves the probability of screening promising drugs and saves the sample size. Sensitivity analysis shows that our design is robust to different design parameters. 
Conclusions Our proposed design gains efficiency, saves sample size, meets ethical requirements, and speeds up the trial process. It performs well and is suitable for COVID-19 clinical trials that screen promising treatments or target optimal therapies.


Background
COVID-19 has affected every aspect of our lives since its first outbreak in 2019. Subsequent waves of case spikes have swept nearly every country, causing considerable morbidity and mortality as the coronavirus and its variants continue to spread and mutate [1]. According to Johns Hopkins data, 222.5 million confirmed cases and 4.5 million deaths had been reported as of Sep 9, 2021 [2]. Approximately 80% of COVID-19 patients have mild or moderate disease, while 14% experience a severe disease course, with case-fatality rates ranging from 0.3 to 7.2% of all confirmed cases [3][4][5]. Because of its high transmissibility, with a basic reproduction number (R0) between 1.4 and 7.23 [6,7], and its substantial disease burden, finding effective treatments is a major concern in the fight against this pandemic.
Thousands of interventional studies related to COVID-19 have been registered in ClinicalTrials.gov, and this number is increasing progressively. Up to now, 11 therapies for COVID-19 have gained emergency use authorization from the US Food and Drug Administration (FDA) [8], inspiring further rapid development of treatments for COVID-19. However, the particularities of a pandemic disease inevitably complicate clinical trials and set them apart from traditional randomized controlled clinical trials. This is mainly reflected in three aspects. First is timeliness [9,10]. A large number of cases and deaths in a short time puts unprecedented pressure on conventional drug discovery paradigms. The screening of effective treatments is highly time sensitive, calling for more innovative trial designs. Second is the exploration of new indications for existing drugs rather than de novo drug design, such as the repurposed lopinavir-ritonavir [11]. In these circumstances, the toxicity profile of the drugs has been well studied and efficacy evaluation is of top interest, so a quick-start phase II trial attracts more attention from researchers. The platform trial recommended by the FDA [12] adapts well to these characteristics. The WHO also pointed out that integrating clinical trials of candidate therapeutics into the response during infectious disease outbreaks is increasingly recognized as important for screening potential drugs. Third is the transient case peak. Because the epidemic is strictly controlled after a case spike, recruitment becomes very difficult once the epidemic stabilizes, which seriously hampers the conduct of clinical trials; examples are the two terminated trials evaluating Remdesivir in China [13]. Another lesson to be learned is that quite a few trials of anti-Ebola virus treatments began only after the Ebola epidemic had abated [14]. 
Therefore, when designing clinical trials for COVID-19, a timely response is necessary. This can be achieved by introducing early stopping criteria and making full use of all available data, including information from trials of other drugs, to improve the efficiency of trials.
There have been several platform designs for COVID-19, including RECOVERY [15], REMAP-CAP [16], ACCORD [17], SOLIDARITY [18], and others. These platform designs are all parallel designs, in which the drug candidates of interest start recruiting patients at the same time (Saville and Berry [19], Yuan et al. [20], and Tang et al. [21]). Their control arms remain unchanged even when effective drugs graduate, which raises the ethical problem of not assigning the latest effective drugs to patients. Moreover, restricting the analysis to concurrent comparisons makes insufficient use of historical control arm information. Given the rapid outbreak of COVID-19, countless patients urgently need effective drugs, so it may be more important to give patients effective drugs as soon as possible. A sequential design may be more suitable under such circumstances. It compares each candidate drug with the standard of care (SOC) sequentially under a unified framework. If declared efficacious, the candidate drug is added to the control arm and then compared with new drug candidates, making it possible for COVID-19 patients to always receive the latest treatments.
The treatment in the current control arm is the same as that in either the previous control arm or the previous experimental arm, so information from previous stages can be borrowed. Furthermore, due to the variability of the coronavirus, SOC may evolve rapidly, and the epidemic has a strong tendency to shift populations (older to younger and back again) during any study. Thus, the baseline effectiveness of SOC may drift over time, and this time trend must be calibrated for in the model as well.
Borrowing information can improve the power of the trial, and adaptive randomization is combined with it to allow more patients to be enrolled into the experimental arm, making the trial more efficient. Several methods for borrowing information have been introduced. Pocock et al. [22] considered the difference in model parameters between historical data and current data and regarded this difference as a random variable. Ibrahim and Chen et al. [23][24][25] proposed the power prior method, in which the prior is constructed by raising the likelihood of the historical data to a prespecified power α. Chu et al. [26] used a calibrated method to measure the heterogeneity and determine α for binary endpoints. A number of improvements to the power prior method have been described in the literature [27][28][29]. Early stopping is another feature of the proposed design, included to meet the timeliness requirement of COVID-19. With early stopping for efficacy and futility, once there is enough evidence to declare a drug effective or futile, the drug graduates to the next stage or is stopped, saving sample size and accelerating development.
Due to the high infectivity and relatively low mortality of COVID-19, most trials choose time-to-event endpoints. Because the achievable effect size on a binary mortality endpoint is extremely limited when mortality is low, traditional binary endpoints would require an excessive sample size. In addition, time-to-event endpoints better reflect changes in disease status and accommodate changing epidemic characteristics [30].
Based on the above considerations, we integrate the power prior and time trend calibration into the sequential platform design framework and extend it to time-to-event endpoints, thus proposing a COVID-19 sequential platform design that adaptively borrows information from previous stages based on the heterogeneity between stages. The proposed design allows two strategies: the comprehensive screening strategy, which aims to screen all drugs that may be effective, and the optimal screening strategy, which aims to screen the most effective drug. The outline of this article is as follows: Section 2 gives a detailed introduction of the proposed model. Section 3 presents the simulation results, and Section 4 gives a specific example of the design. We conclude with a brief discussion in Section 5.

Method
In this study, we propose an adaptive information borrowing platform design, which sequentially enrolls patients to either the experimental arm or the control arm and makes decisions as data become available. If a drug shows sufficient efficacy, it graduates and is added to the control arm, after which a new drug arm opens for recruitment. Information is borrowed only between arms that receive the same treatment. The overall process of the trial is shown in Figure 1.
Consider an exploratory trial whose endpoint is time to clinical remission. For experimental arm A, let $T_A$ denote the time from enrollment to clinical remission. The shorter the remission time, the faster clinical remission is reached and the more effective the drug. Assume that $T_A$ follows an exponential distribution with hazard $\theta_A$ and that $\theta_A$ follows a gamma prior, $\theta_A \sim \mathrm{Gamma}(a, b)$. By conjugacy, the posterior distribution of $\theta_A$ given the data $D_A$ is again a gamma distribution. For control arm B, we assume that the time from enrollment to clinical remission $T_B$ also follows an exponential distribution, so the hazard $\theta_B$ has a posterior distribution of the same form as that in experimental arm A, where $D_B$, $m_B$, and $T_B$ are defined in the same way as $D_A$, $m_A$, and $T_A$.
Suppose there are $K$ interim looks, which occur when the number of enrolled patients reaches $n_1, \ldots, n_K$. Because a small number of enrolled patients in the early stage may lead to unreliable estimates, we do not start the interim analyses until $n_0$ patients are enrolled; we then perform an interim analysis for every $n_k$ patients enrolled, up to a maximum sample size of $n_K$. At each interim analysis, if $\Pr(\theta_A > \theta_B \mid D_A, D_B) \ge C_I$, the drug in experimental arm A is declared effective, and vice versa. After the maximum sample size in each arm is reached, if the hazard of remission in experimental arm A is higher than that in control arm B, that is, $\Pr(\theta_A > \theta_B \mid D_A, D_B) \ge C_F$, experimental arm A is declared more effective than control arm B. The thresholds $C_I$ and $C_F$ are obtained by calibration.
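The gamma-exponential update and the decision quantity $\Pr(\theta_A > \theta_B \mid D_A, D_B)$ can be illustrated with a minimal Python sketch. It assumes that each arm's data are summarized by the number of observed remissions and the total follow-up time, and that Gamma(a, b) is parameterized by shape and rate; the prior values, the event counts, and the function names `gamma_posterior` and `prob_a_faster` are hypothetical and chosen only for illustration.

```python
import random


def gamma_posterior(a, b, events, total_time):
    """Conjugate update for an exponential likelihood.

    Assumes a Gamma(a, b) prior in (shape, rate) form, `events` observed
    remissions and `total_time` summed follow-up time in the arm.
    """
    return a + events, b + total_time


def prob_a_faster(post_a, post_b, draws=20000, seed=1):
    """Monte Carlo estimate of Pr(theta_A > theta_B | data)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        theta_a = rng.gammavariate(post_a[0], 1.0 / post_a[1])
        theta_b = rng.gammavariate(post_b[0], 1.0 / post_b[1])
        hits += theta_a > theta_b
    return hits / draws


# Hypothetical interim data: the same number of remissions in both arms,
# but reached in less total follow-up time in the experimental arm.
post_a = gamma_posterior(0.1, 0.1, events=40, total_time=130.0)
post_b = gamma_posterior(0.1, 0.1, events=40, total_time=200.0)
p = prob_a_faster(post_a, post_b)
print(f"Pr(theta_A > theta_B | data) is approximately {p:.3f}")
```

In a trial, the estimate `p` would be compared with the calibrated thresholds $C_I$ or $C_F$; those thresholds are not reproduced here.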

Borrow Information.
The treatment in the current control arm is the same as that in either the historical control arm or the historical experimental arm. Specifically, when the drug in the previous stage is effective, we add the "graduated" drug to the control arm, so the treatment in the current control arm is the same as that in the historical experimental arm. When the drug in the previous stage is ineffective, we leave the treatment in the control arm unchanged, so the treatment in the current control arm is the same as that in the historical control arm. Therefore, we can use the power prior to borrow information for the current control arm B from previous stages. The power prior method uses the discounted posterior from the historical data as the prior of the current parameter. Let the historical data be $D_H$ and the initial prior of the parameter $\theta$ be $\pi_0(\theta)$; the power prior is then $\pi(\theta \mid D_H, \alpha) \propto L(\theta \mid D_H)^{\alpha}\, \pi_0(\theta)$, where $\alpha$ is the parameter controlling how much to borrow from the historical data. Given the current control arm data $D_B$, the posterior distribution of the hazard in the current control arm combines the current likelihood with this power prior.
Once all the candidate therapies are tested, two drug development strategies are applied:
(1) Comprehensive Screening Strategy. This strategy aims to screen all drugs that may be effective, that is, all drugs that satisfy the following rules will be declared effective and enter the next stage:

Canadian Journal of Infectious Diseases and Medical Microbiology
Pr where c(·, ·) is the lower incomplete gamma function.
(2) Optimal Screening Strategy. This strategy aims to screen the most effective drug, that is, the one with the highest efficacy will be declared optimal and enter the next stage. For example, if drug 1 satisfies rule (8), drug 1 + SOC will replace the old control arm and be compared with the subsequent new drugs. At the end of the trial, the last drug declared effective is selected as the optimal drug.
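For the exponential model with a gamma prior, the power prior described in this subsection keeps the posterior in closed form. The sketch below assumes the historical and current control data are each summarized by a remission count and a total follow-up time, and that Gamma(a, b) uses the shape and rate parameterization; the function name `power_prior_posterior` and all numbers are hypothetical.

```python
def power_prior_posterior(a, b, hist_events, hist_time,
                          cur_events, cur_time, alpha):
    """Gamma posterior for the control-arm hazard under a power prior.

    Raising the exponential likelihood of the historical data to the power
    alpha scales its sufficient statistics (events, total time) by alpha,
    so the result is Gamma(shape, rate) with the values returned below.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    shape = a + alpha * hist_events + cur_events
    rate = b + alpha * hist_time + cur_time
    return shape, rate


# Hypothetical data: 40 remissions over 200 patient-weeks in the previous
# stage, and the same again in the current control arm.
for alpha in (0.0, 0.5, 1.0):
    shape, rate = power_prior_posterior(0.1, 0.1, 40, 200.0, 40, 200.0, alpha)
    print(f"alpha={alpha}: Gamma(shape={shape:.1f}, rate={rate:.1f})")
```

Setting `alpha=0` ignores the historical data, while `alpha=1` pools it fully with the current control arm.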

Time Trend Calibration.
Platform designs that run over a relatively long period may face a drift in baseline effectiveness [31], which is reflected in different hazard ratios between stages. Such drift in the SOC over time needs to be modelled; otherwise, it results in type I error inflation and power reduction. Here, we add a time trend calibration to measure the drift between stages, so that $\alpha$ becomes data driven rather than prespecified. Specifically, two types of data are available at an interim analysis: count data (the number of patients who achieve clinical remission) and survival data (the time to clinical remission for each individual). We use the chi-square statistic $\chi^2$ to measure the heterogeneity for count data and the t-statistic $\tau$ for survival data. Monitoring one indicator alone is not enough to reflect all the information, so we calculate $\alpha$ by synthesizing the information from remission numbers and survival time, where c and ρ are tuning parameters that are calibrated by simulation to keep the type I error under control. Larger $\chi^2$ and $\tau$ indicate that the information in the two stages is heterogeneous, so $\alpha$ becomes smaller and almost no information is borrowed; conversely, smaller values lead to a larger $\alpha$ and more borrowing.
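The two heterogeneity measures can be sketched in Python as follows. This is only an illustration under stated assumptions: the chi-square statistic is taken to be the Pearson statistic of a 2x2 table of remission counts (historical versus current stage), and the t-statistic is taken to be Welch's two-sample statistic on fully observed remission times, ignoring censoring. The function names are hypothetical, and the mapping from these statistics to $\alpha$, which depends on the calibrated tuning parameters, is not sketched.

```python
import math
import statistics


def chi_square_2x2(events_hist, n_hist, events_cur, n_cur):
    """Pearson chi-square for remission counts in two stages (assumed form)."""
    a, b = events_hist, n_hist - events_hist
    c, d = events_cur, n_cur - events_cur
    n = n_hist + n_cur
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no measurable heterogeneity
    return n * (a * d - b * c) ** 2 / denom


def welch_t(times_hist, times_cur):
    """Welch two-sample t-statistic on remission times (assumed form)."""
    m1, m2 = statistics.mean(times_hist), statistics.mean(times_cur)
    v1, v2 = statistics.variance(times_hist), statistics.variance(times_cur)
    return (m1 - m2) / math.sqrt(v1 / len(times_hist) + v2 / len(times_cur))


# Hypothetical stages: identical counts give no heterogeneity signal,
# while a shift from 40/50 to 20/50 remissions gives a large statistic.
print(chi_square_2x2(30, 50, 30, 50))
print(chi_square_2x2(40, 50, 20, 50))
print(welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Large absolute values of either statistic would correspond to the heterogeneous case in which almost no information is borrowed.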

Adaptive Randomization.
After posterior inference with borrowed information, patients are randomized to the arms. Traditionally, equal randomization is used in most cases. However, if a fixed allocation ratio of 0.5 is retained, the amount of effective information can easily become imbalanced between arms, because the control arm also draws on borrowed information. Therefore, adaptive randomization is used to balance information and maximize power, by taking the allocation ratio as a function of the effective sample size. Following Hobbs et al. [32], we assume that the relationship between sample size and precision is linear; the effective sample size is then approximately the effective sample size of the "borrowed" posterior distribution, $n_B\,\mathrm{Prec}(\theta \mid D_B, D_H)/\mathrm{Prec}(\theta \mid D_B)$, minus the sample size of the current control arm $n_B$:
$$\mathrm{ESS} = n_B\,\frac{\mathrm{Prec}(\theta \mid D_B, D_H)}{\mathrm{Prec}(\theta \mid D_B)} - n_B.$$
At the beginning of the trial, 1:1 allocation is used; adaptive randomization starts once a certain amount of information has accumulated. Here $n^*_A(t)$ and $n^*_B(t)$ are the effective sample sizes of the experimental arm and the control arm at interim analysis $t$, $\mathrm{ESS}(t)$ is the estimated effective sample size of the control arm, and $R$ is the number of remaining patients to be randomized. The aim is to balance the effective information between the two arms, and $\tau$ denotes the randomization allocation ratio. Because the effective sample size has no upper or lower limit, the value of $\tau$ defined in this way is not confined to [0, 1], which does not meet practical requirements. Therefore, the allocation ratio is restricted to the range $[p_{\min}, p_{\max}]$.
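The effective sample size and the bounded allocation ratio can be illustrated with the following Python sketch. Several parts are assumptions rather than the design's exact formulas: the precision is taken as the reciprocal variance of a Gamma(shape, rate) posterior, the unbounded ratio is obtained by solving the balance condition $n_A + \tau R = n_B + \mathrm{ESS} + (1 - \tau) R$ for $\tau$, and the bounds `p_min=0.5` and `p_max=0.85` are illustrative values only.

```python
def gamma_precision(shape, rate):
    """Reciprocal variance of a Gamma(shape, rate) distribution."""
    return rate ** 2 / shape


def effective_sample_size(n_b, post_borrow, post_current):
    """ESS = n_B * Prec(theta | D_B, D_H) / Prec(theta | D_B) - n_B."""
    ratio = gamma_precision(*post_borrow) / gamma_precision(*post_current)
    return n_b * ratio - n_b


def allocation_ratio(n_a, n_b, ess, remaining, p_min=0.5, p_max=0.85):
    """Share of remaining patients sent to the experimental arm.

    Assumed balance condition: n_a + tau*R = n_b + ess + (1 - tau)*R,
    then truncated to [p_min, p_max] (illustrative bounds).
    """
    if remaining <= 0:
        raise ValueError("no patients left to randomize")
    raw = 0.5 + (n_b + ess - n_a) / (2.0 * remaining)
    return min(p_max, max(p_min, raw))


# Hypothetical posteriors (shape, rate) for the control-arm hazard,
# with and without borrowed historical information.
post_current = (40.1, 200.1)
post_borrow = (60.1, 300.1)
ess = effective_sample_size(50, post_borrow, post_current)
tau = allocation_ratio(n_a=50, n_b=50, ess=ess, remaining=100)
print(f"ESS is about {ess:.1f}, allocation ratio is about {tau:.3f}")
```

With more borrowed information the unbounded ratio grows, and the truncation keeps it from exceeding `p_max`.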

Trial Process.
Steps for implementing our proposed design are as follows:
(i) Step 1: for the first drug, enroll $n_0$ patients and randomize them equally to the two arms.
(ii) Step 2: collect data for the first drug and fit model (2), conduct interim analyses, and calculate

Simulation
In this section, we run simulations to evaluate the performance of the proposed design. Suppose a platform trial with 5 candidate drugs and 1 SOC. The hazard of the control arm is set to $\theta_B = 0.2$. Under the exponential distribution, the average time to clinical remission in the control arm is $T = 1/\theta = 5$ weeks. A hazard ratio HR > 1 means that the experimental arm has a higher hazard of remission than the control arm, that is, the mean remission time is shorter and the effect is better. Among the 5 drugs in each scenario, bold text indicates effective drugs, and the others are ineffective drugs. When the drug is effective, the probability of rejecting the null hypothesis represents power; when it is ineffective, the probability of rejecting the null hypothesis is the type I error. We control the type I error at 0.1 through simulation. For the platform design that does not borrow information, the allocation ratio is always 1:1. First, we study scenarios without baseline effectiveness drift by applying the comprehensive screening strategy. Table 1 lists the simulation results of the COVID-19 platform design using the power prior method with α fixed at 0.5. In all 6 scenarios, the Pr (reject H 0 ) results for the two designs show that the type I error is below 10%. When the drug is truly effective, the proposed design has a power of more than 85%, which is higher than that without borrowing information. Pr (early stopping for efficacy) and Pr (early stopping for futility) show the probabilities of early efficacy and futility stopping. When the drug is truly effective, the probability of early efficacy stopping is more than 65% in most scenarios. When the drug is highly effective, the probability of early stopping can reach more than 95%. 
When the drug is less effective than the control arm, the probability of early futility stopping is about 58%. These rules allow clearly effective or ineffective drugs to leave the trial as soon as possible, which speeds up new drug development and saves sample size. However, when there is no efficacy difference between the candidate drug and the control arm, that is, HR = 1, we cannot conclude that the drug is effective or ineffective, so the trial continues. In terms of sample size, the actual sample size in scenarios with a high early stopping probability is much smaller than the prespecified sample size, saving roughly 80-120 patients. As can be seen from N A and N B , compared with the nonborrowing method, the proposed design allocates more patients to the experimental arm. As the trial progresses, the control arm can borrow more information (shown by the effective sample size), so the proportion of patients allocated to the experimental arm also increases. To further study the impact of the accumulated information on the allocation ratio, we compared the proportion of patients assigned to the experimental arm with the effective sample size in each scenario. Figure 2 shows how the allocation ratio and the accumulated information change as the trial progresses. Taking scenario 1 as an example, candidate drug 1 has no accumulated information, so the allocation ratio is 0.5. When drug 2 is tested, the control arm is still SOC, so the information from the drug 1 stage can be borrowed and more patients are assigned to the experimental arm in the drug 2 stage. Drug 2 is declared effective. The control arm then becomes drug 2 + SOC, so in the drug 3 stage the information from the drug 2 experimental arm can be borrowed, and the amount of information does not change much. In Figure 2, the effective sample sizes of drug 2 and drug 3 are approximately equal. 
For drugs 4 and 5, because the information in the control arm has accumulated, the effective sample size continues to increase, and the proportion allocated to the experimental arm also increases. However, due to the maximum allocation ratio imposed by formula (5), the proportion allocated to the experimental arm finally levels off at around 0.85. This relationship shows that the more information is accumulated, the greater the effective sample size and thus the higher the proportion of patients assigned to the experimental arm, which meets the ethical requirements and maximizes power. The parameter α in the power prior method, which controls the degree of information borrowing, is recommended to be 0.5. Because different values of α may affect statistical performance differently, we conduct a sensitivity analysis on α. Results are summarized in the Supplementary materials. The proposed design is robust to different values of α in terms of type I error, power, and sample size.
Furthermore, scenarios with baseline effectiveness drift are discussed. We run simulations for the platform design using time trend calibration, in which α does not need to be prespecified, and compare it with the design without calibration. Results are shown in Table 2 and Figure 3. The time trend in the third column represents the baseline hazard of SOC. The underlined text marks the stages in which the baseline hazard drifts. From the Pr (reject H 0 ) results for the two methods, we can see that the type I error is controlled at approximately 0.1 for time trend calibration when drift happens, whereas for the power prior with fixed α, the type I error is inflated due to the inconsistencies between stages. For example, in scenario 1, the time trend parameter for drug 4 is 0.45, so SOC in the drug 4 stage is more effective than in the other stages. Time trend calibration can identify such heterogeneity and borrow almost no information from previous stages, which is confirmed in Figure 3.
We can see that the effective sample size and allocation ratio of drug 4 are both much lower than those of drugs 2 and 3. However, the power prior still borrows information, leading to the inflation of the type I error. As for power, when the drug is truly effective, time trend calibration rejects the null hypothesis with a probability higher than 85%. The power prior may wrongly borrow information and shrink the estimated effect size, resulting in lower power. Based on these results, we conclude that time trend calibration is more robust to baseline effectiveness drift. When drift exists between stages, time trend calibration is strongly recommended. The advantage of the proposed platform design is also reflected in the switch from effectiveness evaluation to optimal drug screening. The simulation results for the proposed platform design with the optimal screening strategy are shown in Table 3 and Figure 4. In general, the optimal screening strategy selects the most effective drug with the highest probability across scenarios. Specifically, in scenario 1, where the effect size is relatively large, the probability of selecting the most effective drug, drug 2, is as high as 93.9%. Unlike in the effectiveness evaluation procedure shown above, once a drug is selected as the optimal drug, the control arm consists of that drug plus SOC only. Therefore, the subsequent candidate drugs (drugs 3-5 in scenario 1) will be compared with drug 2 plus SOC, leading to a relatively high probability of early futility stopping. This shows an advantage of the optimal screening design: drugs that are no better than the current optimal drug are excluded as soon as possible. Since the amount of borrowed information is a function of sample size in formula (10), arms with a high probability of early futility stopping have a smaller effective sample size.
Similarly, the upward trend of ESS in Figure 4 is not as obvious as that in Figures 2 and 3 because of the trade-off between the smaller sample size and the accumulated amount of borrowed information. In other scenarios, with different positions of the optimal drug in the sequence and different effect sizes, the proposed optimal drug screening procedure always selects the optimal drug with the highest probability and stops the trial early for efficacy as long as there is enough evidence. The adaptive randomization rule allocates more patients to the experimental arms, which is consistent with the previous simulation results.
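The data-generating model stated at the start of this section (exponential remission times with a control hazard of 0.2 per week, hence a mean of 5 weeks) can be checked with a short Python sketch. The hazard ratio of 1.5, the sample size of 5000 per arm, the seed, and the function name `simulate_arm` are illustrative assumptions, and censoring and interim analyses are not simulated.

```python
import random


def simulate_arm(hazard, n, rng):
    """Draw n exponential remission times (in weeks) with the given hazard."""
    # random.expovariate takes the rate, so the mean is 1 / hazard.
    return [rng.expovariate(hazard) for _ in range(n)]


rng = random.Random(2021)
theta_b = 0.2  # control-arm hazard, as in the simulation setting
hr = 1.5       # illustrative hazard ratio for an effective drug

control = simulate_arm(theta_b, 5000, rng)
experimental = simulate_arm(theta_b * hr, 5000, rng)

mean_control = sum(control) / len(control)
mean_experimental = sum(experimental) / len(experimental)
print(f"mean remission time, control: {mean_control:.2f} weeks")
print(f"mean remission time, experimental: {mean_experimental:.2f} weeks")
```

A hazard ratio above 1 shortens the mean remission time from $1/\theta_B$ to $1/(\mathrm{HR}\,\theta_B)$, which is why HR > 1 indicates a better effect in this setting.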

Trial Illustration
Several well-known COVID-19 drug candidates registered on ClinicalTrials.gov are taken as an example to illustrate the proposed design with the comprehensive screening strategy. Suppose that the five drugs to be tested are Lopinavir, Favipiravir, CD24Fc, Remdesivir, and hydroxychloroquine. Their true clinical remission times are (5, 5, 5, 3.33, 5). Drug 4, Remdesivir, actually shortens the remission time, and the other drugs are ineffective. Then, the COVID-19 platform design is used to test the 5 drugs sequentially. Results are shown in Figure 5 and Table 4. In the example, at the first stage, the probability that experimental arm A is more effective than control arm B is 8.6%, so drug 1 is declared ineffective. In the next stage, the control arm is still SOC, and the experimental arm is drug 2 + SOC. According to the posterior probability, drug 2 is declared ineffective. The third stage is entered, and drug 3 is found ineffective. In stage 4, the early stopping rule is triggered, so we end the fourth stage early and declare drug 4 effective. Drug 4 is added to the control arm, and stage 5 is entered. Now the control arm is drug 4 + SOC, while the experimental arm is drug 5 + drug 4 + SOC. The posterior probability that the experimental arm is better than the control arm is 4.4%, so drug 5 is declared ineffective. The trial ends, and we finally declare that of all 5 drugs, only drug 4 is effective. The total sample size of this trial is 874, which saves 126 patients compared with the traditional fixed design. This example shows that the proposed design can stop early when the drug is sufficiently effective, speed up the trial process, save sample size, and meet ethical requirements.

Conclusions
Wave upon wave of COVID-19 outbreaks has put heavy pressure on the global disease burden and economy. Presently, there is nothing more important than controlling and ending the outbreak. Since no significantly efficacious treatment has been found yet, the development of new antiviral drugs is paramount to this end. In this situation, the traditional approach is both time consuming and inefficient, so novel trial designs should be adopted to accelerate drug development. Therefore, we propose a platform design that evaluates multiple drug candidates in a unified design framework. Two drug development strategies are discussed here: the comprehensive screening strategy and the optimal screening strategy. The proposed design can substantially shorten the overall trial duration and save sample size in the control arm. Furthermore, the platform design incorporates an early stopping rule for significantly efficacious drugs, allowing patients to gain access to promising treatments as soon as possible, which helps control the spread of the disease. Simulation studies show that the design performs well and is robust to different parameter settings. We adopt the power prior and time trend calibration to borrow information between stages; more robust methods, such as the commensurate prior [33] and the robust meta-analytic-predictive prior [34], could also be used to further improve the performance of the design.

Data Availability
The code used to support the findings of this study is available from the corresponding author upon request.